In this first Jupyter workflow in PyCCAPT, we guide you through cropping atom probe data, whether it was collected with PyCCAPT or comes in another format such as EPOS, POS, ATO, or CSV. The workflow covers both temporal and spatial cropping. It also covers the basic calculations: raw mass-to-charge ratio (m/c), pulses per ion, and ions per pulse. Finally, you can save the cropped data in several formats, including PyCCAPT's native HDF5 format, EPOS, POS, ATO, and CSV.
# Activate interactive functionality of matplotlib
%matplotlib ipympl
# Activate auto reload
%load_ext autoreload
%autoreload 2
%reload_ext autoreload
# import libraries
import os
import numpy as np
from ipywidgets import fixed
from ipywidgets import interact_manual
from ipywidgets import widgets
import warnings
# Ignore all warnings
warnings.filterwarnings("ignore")
# Local module and scripts
from pyccapt.calibration.calibration_tools import share_variables
from pyccapt.calibration.calibration_tools import widgets as wd
from pyccapt.calibration.data_tools import data_tools, data_loadcrop, dataset_path_qt
from pyccapt.calibration.mc import mc_tools, tof_tools
from pyccapt.calibration.calibration_tools import mc_plot
The autoreload extension is already loaded. To reload it, use: %reload_ext autoreload
Click the button below to select the dataset file you want to crop. The dataset file can be in HDF5, EPOS, POS, ATO, or CSV format. The cropped data is saved next to the original dataset file, in a new directory named load_crop, under the same name as the original dataset file. The figures are saved in that same load_crop directory.
button = widgets.Button(
    description='load dataset',
)

@button.on_click
def open_file_on_click(b):
    """
    Event handler for button click event.
    Prompts the user to select a dataset file and stores the selected file path in the global variable dataset_path.
    """
    global dataset_path
    dataset_path = dataset_path_qt.gui_fname().decode('ASCII')

button
import sys
# Install PyTables (HDF5 support) into the active environment
!conda install --yes --prefix {sys.prefix} pytables
# create object for selection of instrument specifications of the dataset
tdc, pulse_mode, flightPathLength_d, t0_d, max_mc, det_diam = wd.dataset_instrument_specification_selection()
# Display lists and comboboxes to selected instrument specifications
display(tdc, pulse_mode, flightPathLength_d, t0_d, max_mc)
# Calculate the maximum possible time of flight (TOF)
max_tof = int(tof_tools.mc2tof(max_mc.value, 1000, 0, 0, flightPathLength_d.value))
print('The maximum possible TOF is:', max_tof, 'ns')
print('=============================')
# Create an instance of the Variables object
variables = share_variables.Variables()
variables.pulse_mode = pulse_mode.value
dataset_main_path = os.path.dirname(dataset_path)
dataset_name_with_extention = os.path.basename(dataset_path)
variables.dataset_name = os.path.splitext(dataset_name_with_extention)[0]
variables.result_data_path = dataset_main_path + '/' + variables.dataset_name + '/load_crop/'
variables.result_data_name = variables.dataset_name
variables.result_path = dataset_main_path + '/' + variables.dataset_name + '/load_crop/'
if not os.path.isdir(variables.result_path):
    os.makedirs(variables.result_path, mode=0o777, exist_ok=True)
print('The data will be saved on the path:', variables.result_data_path)
print('=============================')
print('The dataset name after saving is:', variables.result_data_name)
print('=============================')
print('The figures will be saved on the path:', variables.result_path)
print('=============================')
# Create a data frame from the HDF5 dataset file
dld_group_storage = data_tools.load_data(dataset_path, tdc.value, mode='raw')
# Remove the data with TOF greater than max TOF or below 0 ns
data = data_tools.remove_invalid_data(dld_group_storage, max_tof)
print('Total number of Ions:', len(data))
The maximum possible TOF is: 5010 ns
=============================
The data will be saved on the path: D:/pyccapt/tests/data/data_1642_Aug-30-2023_16-05_Al_test4/load_crop/
=============================
The dataset name after saving is: data_1642_Aug-30-2023_16-05_Al_test4
=============================
The figures will be saved on the path: D:/pyccapt/tests/data/data_1642_Aug-30-2023_16-05_Al_test4/load_crop/
=============================
{'apt': ['high_voltage', 'main_chamber_vacuum', 'num_events', 'pulse', 'temperature', 'time_counter'], 'dld': ['high_voltage', 'pulse', 'start_counter', 't', 'x', 'y'], 'tdc': ['channel', 'high_voltage', 'pulse', 'start_counter', 'time_data'], 'time': ['time_h', 'time_m', 'time_s']}
The number of data over max_tof: 245
Total number of Ions: 12312751
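The maximum TOF printed above is derived from the maximum m/c. A minimal sketch of that conversion follows, assuming the standard energy-conservation relation for a straight flight path and ignoring the detector hit position. The helper name `mc_to_tof_ns` and the numeric inputs are hypothetical; this is not the body of `tof_tools.mc2tof`, whose exact constants and corrections may differ.

```python
import math

# Physical constants (CODATA values).
ELEMENTARY_CHARGE = 1.602176634e-19  # C
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def mc_to_tof_ns(mc_da, voltage_v, flight_path_mm, t0_ns=0.0):
    """Return the time of flight (ns) of an ion with mass-to-charge mc_da (Da).

    Energy conservation gives (m/n) * v**2 / 2 = e * V with v = d / (t - t0).
    Hypothetical helper for illustration, not a PyCCAPT function.
    """
    flight_path_m = flight_path_mm * 1e-3
    seconds = flight_path_m * math.sqrt(
        mc_da * ATOMIC_MASS_UNIT / (2.0 * ELEMENTARY_CHARGE * voltage_v)
    )
    return seconds * 1e9 + t0_ns


# Illustrative inputs only: a 400 Da ion, 1000 V, 110 mm flight path.
example_tof = mc_to_tof_ns(400.0, 1000.0, 110.0)
print(f"{example_tof:.0f} ns")
```

Because the TOF scales with the square root of m/c, quadrupling the maximum m/c only doubles the maximum TOF.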
data
| | high_voltage (V) | pulse | start_counter | t (ns) | x_det (cm) | y_det (cm) |
|---|---|---|---|---|---|---|
| 0 | 600.000000 | 328.0 | 8202 | 2537.802979 | 1.080816 | 0.006531 |
| 1 | 615.000000 | 328.0 | 14741 | 3686.929443 | 1.443265 | -1.812245 |
| 2 | 624.979980 | 328.0 | 2657 | 3110.466553 | -0.688980 | -2.249796 |
| 3 | 624.979980 | 328.0 | 4568 | 1171.380737 | 0.192653 | -0.914286 |
| 4 | 634.919983 | 328.0 | 4498 | 2703.307129 | 0.058776 | 1.479184 |
| ... | ... | ... | ... | ... | ... | ... |
| 12312746 | 8000.000000 | 1600.0 | 11089 | 3722.090332 | 2.282449 | 2.798367 |
| 12312747 | 8000.000000 | 1600.0 | 13935 | 3065.292725 | 3.725714 | -0.675918 |
| 12312748 | 8000.000000 | 1600.0 | 2722 | 2561.627686 | 3.229388 | 1.573878 |
| 12312749 | 8000.000000 | 1600.0 | 3387 | 3579.656494 | 0.414694 | 2.693877 |
| 12312750 | 8000.000000 | 1600.0 | 14288 | 2206.904297 | 1.244082 | -2.847347 |
12312751 rows × 6 columns
Select the data by drawing a rectangle over the experiment history. The experiment history is a 2D histogram of the ions' time of flight versus their evaporation sequence. Click the button below the cell to plot it.
interact_manual(data_loadcrop.plot_crop_experiment_history, data=fixed(data), variables=fixed(variables), max_tof=widgets.FloatText(value=max_tof), frac=widgets.FloatText(value=1.0),
bins=fixed((1200,800)), figure_size=fixed((7,3)),
draw_rect=fixed(False), data_crop=fixed(True), pulse_plot=widgets.Dropdown(options=[('False', False), ('True', True)]), dc_plot=widgets.Dropdown(options=[('True', True), ('False', False)]),
pulse_mode=widgets.Dropdown(options=[('voltage', 'voltage'), ('laser', 'laser')]), save=widgets.Dropdown(options=[('True', True), ('False', False)]),
figname=widgets.Text(value='hist_ini'));
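As a rough illustration of what the experiment history contains, the sketch below bins ions by detection order and time of flight with `numpy.histogram2d`. The function name `experiment_history_histogram` and the synthetic TOF values are assumptions made for this example; `data_loadcrop.plot_crop_experiment_history` may bin and draw the data differently.

```python
import numpy as np


def experiment_history_histogram(tof_ns, max_tof_ns, bins=(1200, 800)):
    """Bin ions by detection order (x axis) and time of flight (y axis)."""
    tof_ns = np.asarray(tof_ns, dtype=float)
    ion_sequence = np.arange(len(tof_ns))
    counts, x_edges, y_edges = np.histogram2d(
        ion_sequence,
        tof_ns,
        bins=bins,
        range=[[0, len(tof_ns)], [0, max_tof_ns]],
    )
    return counts, x_edges, y_edges


# Synthetic TOF values, used only to exercise the function.
rng = np.random.default_rng(0)
fake_tof = rng.uniform(0.0, 5000.0, size=10_000)
counts, x_edges, y_edges = experiment_history_histogram(fake_tof, 5000.0, bins=(100, 80))
print(counts.shape, int(counts.sum()))
```

The `bins` default mirrors the `(1200, 800)` tuple passed in the cell above; a smaller grid is used in the demo to keep it fast.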
The boundaries of the selected (cropped) part of the graph are printed below.
# Plot the selected region of the experiment history
interact_manual(data_loadcrop.plot_crop_experiment_history, data=fixed(data), variables=fixed(variables), max_tof=widgets.FloatText(value=max_tof), frac=widgets.FloatText(value=1.0),
bins=fixed((1200,800)), figure_size=fixed((7,3)),
draw_rect=fixed(True), data_crop=fixed(False), pulse_plot=widgets.Dropdown(options=[('False', False), ('True', True)]), dc_plot=widgets.Dropdown(options=[('True', True), ('False', False)]),
pulse_mode=widgets.Dropdown(options=[('voltage', 'voltage'), ('laser', 'laser')]), save=widgets.Dropdown(options=[('True', True), ('False', False)]),
figname=widgets.Text(value='hist_rect'));
# Crop the dataset
print('Min Idx:', variables.selected_x1, 'Max Idx:', variables.selected_x2)
data_crop_t = data_loadcrop.crop_dataset(data, variables)
Min Idx: 102238.61102259532 Max Idx: 11999937.47063573
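The temporal crop keeps the ions whose position in the evaporation sequence falls between the two printed bounds. A minimal sketch with pandas follows, assuming the bounds are row positions and that only the horizontal extent of the rectangle matters. `crop_by_sequence` is a hypothetical helper, not the implementation of `data_loadcrop.crop_dataset`.

```python
import math

import pandas as pd


def crop_by_sequence(data, x1, x2):
    """Keep rows whose position in the evaporation sequence lies in [x1, x2]."""
    low, high = min(x1, x2), max(x1, x2)
    start = max(math.ceil(low), 0)           # first row at or after the lower bound
    stop = min(math.floor(high) + 1, len(data))  # one past the last row within the upper bound
    return data.iloc[start:stop].reset_index(drop=True)


# Tiny made-up frame to show the behaviour with fractional bounds.
demo = pd.DataFrame({"t (ns)": [float(i) for i in range(10)]})
cropped = crop_by_sequence(demo, 2.4, 6.7)
print(cropped["t (ns)"].tolist())
```

Resetting the index after the crop keeps later positional operations on the cropped frame aligned with its row labels.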
Select the region with the highest concentration of ions in the plot below so that only relevant data is used. To crop, draw a circle over the field desorption map (FDM). The FDM is a 2D histogram of the ion hit positions (x, y) on the detector. Click the button below the cell to plot it.
# Plot and select the FDM
interact_manual(data_loadcrop.plot_crop_fdm, data=fixed(data_crop_t), variables=fixed(variables), frac=widgets.FloatText(value=1.0),
bins=fixed((256,256)), figure_size=fixed((5,4)),
draw_circle=fixed(False), data_crop=fixed(True),
save=widgets.Dropdown(options=[('True', True), ('False', False)]),
figname=widgets.Text(value='fdm_ini'));
The region selected in the previous step is displayed below.
# plot selected area in FDM
interact_manual(data_loadcrop.plot_crop_fdm, data=fixed(data_crop_t), variables=fixed(variables), frac=widgets.FloatText(value=1.0),
bins=fixed((256,256)), figure_size=fixed((5,4)),
draw_circle=fixed(True), data_crop=fixed(False),
save=widgets.Dropdown(options=[('True', True), ('False', False)]),
figname=widgets.Text(value='fdm_circle'));
# Crop the dataset
print('center x:', variables.selected_x_fdm, 'center y:', variables.selected_y_fdm)
print('Radius:', variables.roi_fdm)
if variables.roi_fdm > 0:
    data_crop_spatial = data_loadcrop.crop_data_after_selection(data_crop_t, variables)
else:
    print('Select the data spatially from the cell below')
center x: 0.38342100184526995 center y: 0.35426403432323594
Radius: 3.3414969148944715
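The spatial crop keeps the ions that hit the detector inside the drawn circle. The sketch below shows the underlying test, assuming a plain Euclidean distance check in detector coordinates. `circular_mask` is a hypothetical helper and the sample points are made up; `data_loadcrop.crop_data_after_selection` may apply the selection differently.

```python
import numpy as np


def circular_mask(x, y, center_x, center_y, radius):
    """Return True for points inside or on the circle, False otherwise."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return (x - center_x) ** 2 + (y - center_y) ** 2 <= radius ** 2


# Made-up detector hits (cm) and a made-up circle.
x_hits = np.array([0.0, 3.0, 4.0, 0.0])
y_hits = np.array([0.0, 0.0, 0.0, -3.5])
inside = circular_mask(x_hits, y_hits, center_x=0.5, center_y=0.0, radius=3.0)
print(inside)
```

With a DataFrame such as `data_crop_t`, the same mask could be built from the `'x_det (cm)'` and `'y_det (cm)'` columns and used for boolean indexing.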
The final selected data after processing is shown below.
# Crop and plot the dataset
interact_manual(data_loadcrop.plot_crop_fdm, data=fixed(data_crop_spatial), variables=fixed(variables), frac=widgets.FloatText(value=1.0),
bins=fixed((256,256)), figure_size=fixed((5,4)),
draw_circle=fixed(False), data_crop=fixed(False),
save=widgets.Dropdown(options=[('True', True), ('False', False)]),
figname=widgets.Text(value='fdm'));
Calculate pulses since the last event pulse and ions per pulse.
pulse_pi, ion_pp = data_loadcrop.calculate_ppi_and_ipp(data_crop_spatial)
# Add the two calculated arrays to the cropped dataset
data_crop_spatial['pulse_pi'] = pulse_pi.astype(np.uintc)
data_crop_spatial['ion_pp'] = ion_pp.astype(np.uintc)
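To make the two quantities concrete, here is a vectorized sketch under stated assumptions: the pulse counter is non-decreasing, "pulses since the last event" is the counter difference to the previous ion, and "ions per pulse" is the number of ions sharing a counter value. `pulses_and_ions_per_pulse` is a hypothetical helper; `data_loadcrop.calculate_ppi_and_ipp` may use a different convention, for example for how multi-hit ions are labelled.

```python
import numpy as np


def pulses_and_ions_per_pulse(pulse_counter):
    """Return (pulses since previous ion, ions sharing the same pulse) per ion."""
    counter = np.asarray(pulse_counter, dtype=np.int64)
    if counter.size == 0:
        return counter.copy(), counter.copy()
    # Counter difference to the previous ion; the first ion gets 0.
    pulses_since_last = np.diff(counter, prepend=counter[0])
    # Count how many ions carry each counter value, then map back to each ion.
    _, inverse, counts = np.unique(counter, return_inverse=True, return_counts=True)
    ions_per_pulse = counts[inverse]
    return pulses_since_last, ions_per_pulse


# Made-up counter values; the repeated 170 represents a double hit.
ppi, ipp = pulses_and_ions_per_pulse([100, 170, 170, 708])
print(ppi.tolist(), ipp.tolist())
```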
The percentage of ions lost in the ROI selection process is printed below.
# Percentage of ions removed by the temporal and spatial crops
print('tof Crop Loss {:.2f} %'.format((100 - (len(data_crop_spatial) / len(data)) * 100)))
# Fraction of ions whose ions-per-pulse value is not 1
print('percentage of double event per pulse', len(ion_pp[ion_pp != 1]) / float(len(ion_pp)))
tof Crop Loss 11.56 %
percentage of double event per pulse 0.018185610167477363
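The loss figure can be checked by hand from counts printed in this notebook. The sketch below assumes the cropped length equals the final row count plus the rows later removed for exceeding the maximum m/c (10201109 + 687885), since no rows were removed by the zero check. `crop_loss_percent` is a hypothetical helper.

```python
def crop_loss_percent(n_before, n_after):
    """Return the percentage of rows removed between two stages."""
    if n_before <= 0:
        raise ValueError("n_before must be positive")
    return 100.0 * (1.0 - n_after / n_before)


n_total = 12_312_751                  # ions after removing invalid TOF values
n_after_crop = 10_201_109 + 687_885   # final rows plus rows later dropped for m/c
loss = crop_loss_percent(n_total, n_after_crop)
print(f"{loss:.2f} %")
```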
# Extract the needed data from the pandas data frame as NumPy arrays
variables.dld_high_voltage = data_crop_spatial['high_voltage (V)'].to_numpy()
variables.dld_pulse = data_crop_spatial['pulse'].to_numpy()
variables.dld_t = data_crop_spatial['t (ns)'].to_numpy()
variables.dld_x = data_crop_spatial['x_det (cm)'].to_numpy()
variables.dld_y = data_crop_spatial['y_det (cm)'].to_numpy()
In the next cell, you can correct the position of the H1 peak by changing the t0 value. This correction helps position the peaks correctly in the m/c calibration process.
def fine_tune_t_0(variables, t0_d, bin_size, log, mode, target, prominence, distance, percent, figname, lim):
    variables.mc = mc_tools.tof2mc(variables.dld_t, t0_d, variables.dld_high_voltage, variables.dld_x, variables.dld_y, flightPathLength_d.value,
                                   variables.dld_pulse, mode=pulse_mode.value)
    if target == 'mc':
        mc_hist = mc_plot.AptHistPlotter(variables.mc[variables.mc < lim], variables)
        mc_hist.plot_histogram(bin_width=bin_size, mode=mode, label='mc', steps='stepfilled', log=log, fig_size=(9, 5))
    elif target == 'tof':
        mc_hist = mc_plot.AptHistPlotter(variables.dld_t[variables.dld_t < lim], variables)
        mc_hist.plot_histogram(bin_width=bin_size, mode=mode, label='tof', steps='stepfilled', log=log, fig_size=(9, 5))
    if mode != 'normalized':
        mc_hist.find_peaks_and_widths(prominence=prominence, distance=distance, percent=percent)
        mc_hist.plot_peaks()
        mc_hist.plot_hist_info_legend(label='mc', bin=0.1, background=None, loc='right')
    mc_hist.save_fig(label=mode, fig_name=figname)
interact_manual(fine_tune_t_0, variables=fixed(variables), t0_d=widgets.FloatText(value=t0_d.value), bin_size=widgets.FloatText(value=0.1),
log=widgets.Dropdown(options=[('True', True), ('False', False)]), mode=widgets.Dropdown(options=[('normal', 'normal'), ('normalized', 'normalized')]),
target=widgets.Dropdown(options=[('mc', 'mc'), ('tof', 'tof')]), prominence=widgets.IntText(value=10), distance=widgets.IntText(value=100),
lim=widgets.IntText(value=400), percent=widgets.IntText(value=50), figname=widgets.Text(value='hist'));
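To see why t0 shifts the peaks, consider a simplified version of the TOF to m/c conversion. The sketch assumes the DC voltage alone accelerates the ion (the pulse contribution that `mc_tools.tof2mc` receives is ignored) and that the flight length is the straight line from the tip to the hit position. `tof_to_mc` and all numeric inputs are hypothetical.

```python
import numpy as np

ELEMENTARY_CHARGE = 1.602176634e-19  # C
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def tof_to_mc(t_ns, t0_ns, voltage_v, x_det_cm, y_det_cm, flight_path_mm):
    """Return m/c in Da from time of flight, using m/n = 2 e V (t - t0)^2 / L^2."""
    t_s = (np.asarray(t_ns, dtype=float) - t0_ns) * 1e-9
    x_m = np.asarray(x_det_cm, dtype=float) * 1e-2
    y_m = np.asarray(y_det_cm, dtype=float) * 1e-2
    d_m = flight_path_mm * 1e-3
    # Squared straight-line distance from the tip to the detector hit.
    path_sq = d_m ** 2 + x_m ** 2 + y_m ** 2
    voltage = np.asarray(voltage_v, dtype=float)
    return 2.0 * ELEMENTARY_CHARGE * voltage * t_s ** 2 / (ATOMIC_MASS_UNIT * path_sq)


# Same made-up ion evaluated with two t0 values: a larger t0 gives a smaller m/c.
mc_t0_0 = float(tof_to_mc(600.0, 0.0, 5000.0, 0.0, 0.0, 110.0))
mc_t0_10 = float(tof_to_mc(600.0, 10.0, 5000.0, 0.0, 0.0, 110.0))
print(mc_t0_0 > mc_t0_10)
```

Because m/c depends on the square of (t - t0), a fixed t0 offset moves light, fast ions such as H proportionally more than heavy ones, which is why the H1 peak is a convenient reference.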
data_crop_spatial_back = data_crop_spatial.copy()
data_crop_spatial_back.insert(0, 'x (nm)', np.zeros(len(variables.dld_t)))
data_crop_spatial_back.insert(1, 'y (nm)', np.zeros(len(variables.dld_t)))
data_crop_spatial_back.insert(2,'z (nm)', np.zeros(len(variables.dld_t)))
data_crop_spatial_back.insert(3,'mc_c (Da)', np.zeros(len(variables.dld_t)))
data_crop_spatial_back.insert(4, 'mc (Da)', variables.mc)
data_crop_spatial_back.insert(8,'t_c (ns)', np.zeros(len(variables.dld_t)))
Remove the data with m/c greater than the maximum m/c, and the data with x, y, and t all equal to 0.
# Remove the data with mc bigger than max mc
mask = (data_crop_spatial_back['mc (Da)'].to_numpy() > max_mc.value)
print('The number of data over max_mc:', len(mask[mask==True]))
data_crop_spatial_back.drop(np.where(mask)[0], inplace=True)
data_crop_spatial_back.reset_index(inplace=True, drop=True)
# Remove the data with x,y,t = 0
mask1 = (data_crop_spatial_back['x (nm)'].to_numpy() == 0)
mask2 = (data_crop_spatial_back['y (nm)'].to_numpy() == 0)
mask3 = (data_crop_spatial_back['t (ns)'].to_numpy() == 0)
mask = np.logical_and(mask1, mask2)
mask = np.logical_and(mask, mask3)
print('The number of data with having t, x, and y equal to zero is:', len(mask[mask==True]))
data_crop_spatial_back.drop(np.where(mask)[0], inplace=True)
data_crop_spatial_back.reset_index(inplace=True, drop=True)
The number of data over max_mc: 687885
The number of data with having t, x, and y equal to zero is: 0
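The cell above drops rows with `DataFrame.drop`, which removes by index label, using positions from `np.where`. That works when the frame has a default 0..n-1 index. As a sketch of an alternative that does not depend on the index, boolean indexing keeps the rows where the mask is False. `drop_rows_where` and the sample frame are hypothetical.

```python
import numpy as np
import pandas as pd


def drop_rows_where(frame, mask):
    """Return a copy of frame without the rows flagged by mask, reindexed from 0."""
    mask = np.asarray(mask, dtype=bool)
    return frame.loc[~mask].reset_index(drop=True)


# Made-up frame with a non-default index, as can remain after an earlier crop.
demo = pd.DataFrame({"mc (Da)": [10.0, 500.0, 30.0, 450.0]}, index=[5, 6, 7, 8])
too_heavy = demo["mc (Da)"].to_numpy() > 400.0
print("rows over the limit:", int(too_heavy.sum()))
cleaned = drop_rows_where(demo, too_heavy)
print(cleaned)
```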
The final cropped dataset is displayed below.
data_crop_spatial_back
| | x (nm) | y (nm) | z (nm) | mc_c (Da) | mc (Da) | high_voltage (V) | pulse | start_counter | t_c (ns) | t (ns) | x_det (cm) | y_det (cm) | pulse_pi | ion_pp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 0.0 | 0.0 | 0.0 | 14.136714 | 5019.720215 | 1003.943970 | 3495 | 0.0 | 446.853577 | 2.964898 | -0.169796 | 0 | 0 |
| 1 | 0.0 | 0.0 | 0.0 | 0.0 | 29.616535 | 5019.720215 | 1003.943970 | 3565 | 0.0 | 616.451904 | -1.936327 | 0.088163 | 70 | 2 |
| 2 | 0.0 | 0.0 | 0.0 | 0.0 | 30.456534 | 5019.720215 | 1003.943970 | 4103 | 0.0 | 623.172729 | -1.648980 | 0.672653 | 538 | 1 |
| 3 | 0.0 | 0.0 | 0.0 | 0.0 | 29.550233 | 5019.720215 | 1003.943970 | 4134 | 0.0 | 616.253052 | -1.975510 | -0.218776 | 31 | 1 |
| 4 | 0.0 | 0.0 | 0.0 | 0.0 | 28.966265 | 5019.720215 | 1003.943970 | 4205 | 0.0 | 605.431091 | 1.296327 | -0.173061 | 71 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 10201104 | 0.0 | 0.0 | 0.0 | 0.0 | 141.314648 | 6347.270020 | 1269.453979 | 8768 | 0.0 | 1154.633423 | -0.734694 | -1.577143 | 403 | 1 |
| 10201105 | 0.0 | 0.0 | 0.0 | 0.0 | 29.461296 | 6347.270020 | 1269.453979 | 8799 | 0.0 | 548.825195 | 0.408163 | 1.508571 | 31 | 1 |
| 10201106 | 0.0 | 0.0 | 0.0 | 0.0 | 29.600126 | 6347.270020 | 1269.453979 | 9443 | 0.0 | 547.803345 | -1.165714 | 0.097959 | 485 | 1 |
| 10201107 | 0.0 | 0.0 | 0.0 | 0.0 | 29.719453 | 6347.270020 | 1269.453979 | 9771 | 0.0 | 555.038513 | -1.645714 | 1.296327 | 328 | 1 |
| 10201108 | 0.0 | 0.0 | 0.0 | 0.0 | 29.823040 | 6347.270020 | 1269.453979 | 10327 | 0.0 | 556.574707 | -0.982857 | 1.933061 | 556 | 1 |
10201109 rows × 14 columns
The data types of the final cropped dataset are displayed below.
data_crop_spatial_back.dtypes
x (nm)              float64
y (nm)              float64
z (nm)              float64
mc_c (Da)           float64
mc (Da)             float64
high_voltage (V)    float64
pulse               float64
start_counter        uint32
t_c (ns)            float64
t (ns)              float64
x_det (cm)          float64
y_det (cm)          float64
pulse_pi             uint32
ion_pp               uint32
dtype: object
Save the cropped dataset. You can choose the output format from the list below: HDF5, EPOS, POS, ATO, or CSV. The output file is saved next to the original dataset file, in a new directory named load_crop.
interact_manual(data_tools.save_data, data=fixed(data_crop_spatial_back), variables=fixed(variables),
hdf=widgets.Dropdown(options=[('True', True), ('False', False)]),
epos=widgets.Dropdown(options=[('False', False), ('True', True)]),
pos=widgets.Dropdown(options=[('False', False), ('True', True)]),
ato_6v=widgets.Dropdown(options=[('False', False), ('True', True)]),
csv=widgets.Dropdown(options=[('False', False), ('True', True)]));
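For orientation on one of these formats, a POS file is commonly laid out as consecutive big-endian 32-bit floats, four per ion: x, y, z, and m/c. The sketch below writes and reads such a file with NumPy. `write_pos` and `read_pos` are hypothetical helpers with made-up values, not the code behind `data_tools.save_data`.

```python
import os
import tempfile

import numpy as np


def write_pos(path, x_nm, y_nm, z_nm, mc_da):
    """Write ions as big-endian float32 records of (x, y, z, m/c)."""
    records = np.column_stack([x_nm, y_nm, z_nm, mc_da]).astype(">f4")
    records.tofile(path)


def read_pos(path):
    """Read a POS file back into an (n_ions, 4) array."""
    return np.fromfile(path, dtype=">f4").reshape(-1, 4)


# Round trip with two made-up ions in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    pos_path = os.path.join(tmp_dir, "example.pos")
    write_pos(pos_path, [0.0, 1.5], [0.0, -2.0], [0.5, 0.75], [27.0, 13.5])
    file_size = os.path.getsize(pos_path)
    round_trip = read_pos(pos_path)

print(file_size, round_trip.shape)
```

Each ion occupies 16 bytes, so the file size divided by 16 gives the ion count, which is a quick sanity check on an exported file.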